ABSTRACT


The objective of this research is to develop a nonlinear 
regulator for an adaptive control system using backpropagating 
neural networks (BNN’s) in conjunction with a linear quadratic 
regulator (LQR). The basic concepts of adaptive control and 
the structure of neural networks are discussed. These 
concepts are integrated and the nonlinear regulator is 
derived. Simulation is conducted on a representative 
nonlinear system with both the LQR and the nonlinear 
regulator. Training of the regulator and its performance 
under varying BNN parameter values are examined.

The simulation results show that the nonlinear regulator 
with BNN’s exhibits superior performance compared to the LQR 
when the nonlinearities are large. The optimization of 
regulator performance with regard to BNN parameter values is 
discussed. 

Further research is required in order to determine the
general applicability of this regulator and to develop more
specific guidelines for BNN parameters.
I. INTRODUCTION


A. OBJECTIVE 

The objective of this thesis is to develop a nonlinear 
regulator for an adaptive control system using backpropagating 
neural networks (BNN’s) in conjunction with a linear quadratic 
regulator (LQR). Discrete time models are used to represent 


the systems for simulation and analysis. 


B. BASIC CONCEPTS OF ADAPTIVE CONTROL 

There are inherent nonlinearities in most control systems 
due to elements such as environmental changes, minor system 
component failures, random time-varying parameters, and hard 
limits caused by physical constraints. These nonlinearities 
may be handled by an optimal control design if it is 
sufficiently robust, but with large uncertainties the 
controller has to sacrifice performance in favor of robustness 
and this might be unsatisfactory. 

One method of handling these nonlinearities is to use an 
adaptive control system. An adaptive control system is one 
which continuously and automatically measures the dynamic 
characteristics of the plant, compares them with the desired 
attributes, and uses the difference to vary adjustable system 
parameters so that optimal performance can be maintained 


regardless of the nonlinearities encountered [Ref. 1: p. 792]. 


1. Performance Indexes 

The basis of adaptive control rests on the premise that 
there is some performance of the system which is optimal. 
Optimal performance is defined by specifying a performance 
index to measure the closeness of the controlled system to its 
goal. There is an infinite number of possibilities in the 
choice of a performance index. The choice of a particular 
index depends on the system and the desired results. In most 
cases the choice of index involves a compromise of minimizing 
the system costs while maximizing the system performance. 

The major drawback of performance indexes is that, while they
specify the cost of system operation, they do not give
information about the transient response of the system [Ref.
1: p.793]. A system that operates optimally according to the
performance index may have undesirable transient 
characteristics. The transient response of the system must be 
analyzed to validate the choice of weighting matrices used in 
the index. 

2. Adaptive Control Systems 

An adaptive control system may include the following three 
functions: 

• Classification of the dynamic characteristics of the plant.
• Decision making based on the classification of the plant.
• Modification based upon the decisions made.


If the plant parameters are not exactly correct, then the 
initial classification, decision, and modification procedures 
will be insufficient to optimize the performance index. It is 
then necessary to continuously carry out these procedures 
throughout the period of operation. This constant redesign of
the system identifies an adaptive control system [Ref. 1:
pp.793-794].


C. NEURAL NETWORKS IN ADAPTIVE CONTROL 

While control theory for linear time-invariant (LTI) 
systems is a relatively mature field, nonlinear control 
systems generally must be designed on a system-to-system 
basis. Neural networks, with their inherent adaptability, 
have the potential for wide application in nonlinear control 
systems. The ability of a neural network to be trained 
suggests that a neural network may be able to successfully 
control nonlinear systems which have poor performance when 
regulated by linear time-invariant controllers. This thesis
will investigate the use of neural networks in conjunction
with LTI control methods to construct a nonlinear regulator
[Ref. 2: p.410]. This regulator will be implemented on a
system that may be modeled as LTI with an added nonlinearity
which makes control by LTI methods either inefficient or
unstable.

Two neural networks shall be used in the formulation of 


the regulator: one in parallel with the estimated plant 


parameters to generate a better state vector estimate and one 
in parallel with the linear control to produce an improved 
control input to the system. Chapter II discusses a general 
neural network structure and the derivation of the nonlinear 
regulator. The performance of the regulator on a particular 
system is presented in Chapter III and conclusions are given 


in Chapter IV. 


II. BNN NONLINEAR ADAPTIVE CONTROL 


A. BASIC CONCEPTS OF NEURAL NETWORKS 
Artificial neural networks are made up of elements which 
operate in a manner analogous to the most simple functions of 
biological neurons. These elements are linked in a fashion 
which may be similar to the connections within the human 
brain. Whether or not artificial neural networks actually are 
representative of the construction of the brain, they do show 
characteristics which are reminiscent of the human brain. An 
artificial neural network may be trained, it can recognize 
patterns, and it may apply its learning from past lessons to 
new data. These traits are quite limited; however, they lend
themselves to a wide field of applications.
1. The Artificial Neuron 
The artificial neuron is meant to mimic the first-order
characteristics of the biological neuron. A set of
inputs is applied, each input representing the output of
another neuron. Each input is multiplied by a corresponding
linkweight and all of the weighted inputs are summed. This
sum is the activation level of the neuron. Figure 1 shows a
detailed representation of a single neuron.

Figure 1. A Neuron

In Figure 1 the inputs are labeled $x_1, x_2, \ldots, x_N$.
These inputs are defined collectively as the (N x 1) column
vector x. Each input is multiplied by its corresponding
linkweight $a_1, a_2, \ldots, a_N$ (row vector a, size (1 x N))
and applied to the summation node [Ref. 3: pp.12-14]. The
summation node produces a scalar output z which may be stated
in vector notation:

$z = a x$. (2-1)

The output of the summation node, z, is further processed by
an activation function f to produce the neuron's output signal
$x_{out}$:

$x_{out} = f(z)$. (2-2)


There are numerous activation functions such as a
simple threshold, a hard-limited linear function, the
hyperbolic tangent, and the sigmoid. These functions are
shown in Figure 2. The main purpose of the activation
function is to serve as a nonlinear gain so that each neuron
maps a wide range of inputs into a bounded output.
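
As an illustration of equations (2-1) and (2-2), the following is a minimal sketch of a single neuron in Python. The function names (`neuron_output`, `sigmoid`, `hard_limit`) and the default choice of the hyperbolic tangent are assumptions made for this example only; they are not taken from the thesis software.

```python
import math

def sigmoid(z):
    """Sigmoid activation: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hard_limit(z, lo=-1.0, hi=1.0):
    """Hard-limited linear activation: linear between lo and hi, clipped outside."""
    return max(lo, min(hi, z))

def neuron_output(inputs, linkweights, activation=math.tanh):
    """Weighted sum of the inputs (2-1) passed through an activation function (2-2)."""
    if len(inputs) != len(linkweights):
        raise ValueError("inputs and linkweights must have the same length")
    z = sum(a * x for a, x in zip(linkweights, inputs))  # summation node
    return activation(z)
```

Passing `activation=lambda z: z` reproduces the purely linear summation node, which is how the output layer is treated later in the derivation.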

2. Single Layer Neural Networks 
By themselves, neurons have limited capabilities,
unless they are grouped together in layered networks. Figure
3 shows a single layer neural network with its inputs on the
far left of the figure and the outputs on the right. The
single layer neural network only performs the vector
multiplication; there is no activation function.

Figure 2. Various Activation Functions

Figure 3. Single Layer Neural Network

The circles on the left of Figure 3 serve only as
distribution points for the inputs; they do not perform any
calculations so they are not neurons. Each element of the
input vector x is applied to each neuron. The linkweights may
be considered as a matrix a with m rows and n columns where m
is the number of outputs and n is the number of inputs. The
number n is one greater than the dimension of the input vector
x due to the bias term. Using equation (2-1), z is the (m x
1) output vector:

$z = a x$. (2-3)


The non-zero bias term is added as an input to each neuron. 
This bias input is also multiplied by its own linkweight. 
3. Multiple Layer Neural Networks 

Neural networks with multiple layers offer greater
computational capabilities than single layer networks. These
networks are formed by cascading a number of single layers
with the outputs of one layer providing the inputs to the next
layer. These middle layers are known as hidden layers.
Figure 4 shows a multiple layer neural network. The key to
providing the extra power of the multiple layer networks is
that the activation function is included in the hidden layers
of the network. Otherwise, the multiple layer network could
be modeled by a single layer network which had a linkweight
matrix equal to the product of the individual linkweight
matrices [Ref. 3: p.19]. The output layer usually has the
function f(x) = x, waiving the limits of the activation
function.

Figure 4. Multiple Layered Neural Network
4. Training of Neural Networks 

A network is trained so that a set of inputs will 
produce a desired set of outputs. Training is accomplished by 
applying an input vector, computing the output vector, 
comparing it to the desired vector, and modifying the 


linkweights by a predetermined algorithm. As the network is 


trained, the linkweights will converge to values which will 
produce the desired output vector. 

There are several methods of training a neural 
network, one of which is the backpropagation routine. This is 
the algorithm used to train the neural networks used in this 
thesis. The next section discusses this technique. 

5. Backpropagating Neural Networks 

The backpropagating neural network is structured as
shown in Figure 4. The signal flows from input to output. We
assume no connections between the neurons of a layer and no
feedback from any layer to the previous layers.
Backpropagation refers to the method used to adjust the
linkweights throughout the neural network.

The output of a neuron in the last layer ($x_j^M$) is used
with a desired output ($x_{j,des}$) to produce an error signal ($e_j$).
This error signal is multiplied by the first derivative of the
activation function for that neuron. Mathematically,

$\delta_j^M = f'(z_j^M)(x_j^M - x_{j,des}) = f'(z_j^M) e_j$. (2-4)


Then $\delta$ is multiplied by the output from the source neuron for
the linkweight which is to be updated and in turn is
multiplied by a scaling factor $\mu$, the learning rate. This
results in

$\Delta a_{ij}^M = \delta_j^M x_i^{M-1}$ (2-5)

and

$a_{ij,t+1}^M = a_{ij,t}^M - \mu \Delta a_{ij}^M$, (2-6)

where
• $a_{ij}^M$ is the linkweight from neuron i in the layer M-1 to
neuron j in the output layer M,
• $\delta_j^M$ is the value of $\delta$ for the linkweight $a_{ij}^M$,
• t+1 denotes the updated linkweight.


The linkweights in the hidden layers cannot be trained
by this process, since there is no available target vector.
Backpropagation trains the hidden layer linkweights by
propagating the output error back through the network,
adjusting the linkweights for each layer. Equations (2-4),
(2-5), and (2-6) are still used for the hidden layers, but the
target vector must be generated differently. The $\delta$ is
calculated for each neuron in the output layer, as in Equation
(2-4). The linkweights feeding the output layer are adjusted
using Equations (2-5) and (2-6). The $\delta_j^M$ is propagated back
through the network to generate a value for $\delta$ for each neuron
in the hidden layer. These values are used to adjust the
weights of the preceding hidden layer, all the way back to the
linkweights that act upon the inputs.

This is most easily shown in vector notation. The
vector of $\delta$'s for the output layer is defined as $D_M$, and the
set of linkweights for the output layer as the matrix $A_M$. To
find $D_{M-1}$, multiply $D_M$ by the transpose of the matrix $A_M$.
Then multiply each component of this vector by the first
derivative of the activation function for the corresponding
neuron in the hidden layer. This yields the vector of $\delta$'s
($D_{M-1}$) for the hidden layer. Mathematically,

$D_{M-1} = [D_M A_M^T] \; .\!* \; f'(z_{M-1})$, (2-7)

where .* denotes component-by-component multiplication of
arrays [Ref. 3: pp.51-53].
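
The procedure of equations (2-4) through (2-7) can be sketched in Python for a network with one hidden layer. This is an illustrative sketch only: the function name `backprop_step`, the use of a sigmoid in the hidden layer with a linear output layer, and the storage of each linkweight matrix with one row per destination neuron are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(x, target, w_hidden, w_out, mu):
    """One backpropagation update; returns the new weights and the pre-update error.

    w_hidden[i][k] links input k to hidden neuron i;
    w_out[j][i] links hidden neuron i to output neuron j.
    """
    # Forward pass: sigmoid hidden layer, linear output layer f(x) = x.
    z_hid = [sum(w * xi for w, xi in zip(row, x)) for row in w_hidden]
    x_hid = [sigmoid(z) for z in z_hid]
    x_out = [sum(w * xh for w, xh in zip(row, x_hid)) for row in w_out]

    # Output deltas, equation (2-4); f' = 1 for the linear output layer.
    d_out = [xo - t for xo, t in zip(x_out, target)]

    # Hidden deltas, equation (2-7): propagate d_out back through w_out,
    # then scale by the sigmoid derivative f'(z) = f(z) * (1 - f(z)).
    d_hid = [x_hid[i] * (1.0 - x_hid[i])
             * sum(d_out[j] * w_out[j][i] for j in range(len(w_out)))
             for i in range(len(w_hidden))]

    # Linkweight updates, equations (2-5) and (2-6).
    new_w_out = [[w_out[j][i] - mu * d_out[j] * x_hid[i]
                  for i in range(len(x_hid))] for j in range(len(w_out))]
    new_w_hidden = [[w_hidden[i][k] - mu * d_hid[i] * x[k]
                     for k in range(len(x))] for i in range(len(w_hidden))]

    error = 0.5 * sum(d * d for d in d_out)
    return new_w_hidden, new_w_out, error
```

Calling `backprop_step` repeatedly on the same input and target pair should drive the returned error toward zero, provided the learning rate is small enough for the gradient step to remain stable.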


B. INTEGRATION OF NEURAL NETWORKS AND ADAPTIVE CONTROL 

As suggested earlier, the versatility of neural networks
makes them prime candidates for nonlinear controllers. The
vast majority of LTI systems can be controlled by
linear quadratic regulators (LQRs). However, the LQR itself
may not exhibit satisfactory performance in the presence of
large uncertainties. That is, inaccuracies in the estimation
of system parameters or nonlinearities in the system may cause
the LQR to lose its optimality. In particular, nonlinear
system dynamics cannot be accounted for in LQR design.

The proposal here is to derive a nonlinear adaptive
regulator which uses neural networks to compensate for the
nonlinear dynamics. This regulator will include two
backpropagating neural networks (BNN's): one for modeling and
one for control. The modeling BNN is used to find a more
accurate state vector estimate than that found using the given
system parameters. The control BNN modifies the control input
by adding nonlinear terms to the LQR.
1. Linear Quadratic Regulator 
The LQR is a state feedback controller which minimizes
a performance index known as the cost function. Consider the
LTI system:

$x_{t+1} = A x_t + B u_t$, (2-8)

where $x_t = (x_{1,t}, x_{2,t}, \ldots, x_{n,t})^T$ is the n-dimensional state
vector, $u_t = (u_{1,t}, u_{2,t}, \ldots, u_{m,t})^T$ is the m-dimensional control
vector, and A and B are the system parameters of dimensions (n
x n) and (n x m), respectively. The LQR is a regulator which
minimizes the cost function:

$J = \sum_{t=1}^{\infty} (x_t^T Q x_t + u_t^T R u_t)$ (2-9)

where Q is an (n x n) symmetric and positive semi-definite
matrix and R is an (m x m) positive definite matrix. The
optimal control $u_t$ is defined by

$u_t = -K x_t$ (2-10)

where

$K = [B^T P B + R]^{-1} B^T P A$. (2-11)

K is the (m x n) optimal feedback gain matrix and the (n x n)
matrix P is the solution to the algebraic Riccati equation
[Ref. 4: p.320]:

$P = A^T P A + Q - A^T P B [B^T P B + R]^{-1} B^T P A$. (2-12)


The LQR does not take nonlinear system
dynamics into consideration, but it does give a good baseline
for comparison to the nonlinear regulator derived in the next
section.
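
One way to obtain K and P numerically is to iterate the right-hand side of equation (2-12) until P stops changing and then evaluate equation (2-11). The sketch below does this in plain Python for the single-input case (m = 1), where the bracketed term in (2-11) is a scalar. The function names, the fixed iteration count, and the restriction to a single input are assumptions made for illustration; this is not presented as the method used in the thesis software.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def gain(A, B, P, R):
    """Equation (2-11) for a single input: B is (n x 1) and R is a scalar."""
    BtP = matmul(transpose(B), P)          # (1 x n)
    s = matmul(BtP, B)[0][0] + R           # scalar B'PB + R
    return [[v / s for v in matmul(BtP, A)[0]]]

def lqr_single_input(A, B, Q, R, iterations=500):
    """Iterate the Riccati equation (2-12) from P = Q; return (K, P)."""
    n = len(A)
    At = transpose(A)
    P = [row[:] for row in Q]
    for _ in range(iterations):
        K = gain(A, B, P, R)
        AtP = matmul(At, P)
        AtPA = matmul(AtP, A)
        corr = matmul(matmul(AtP, B), K)   # A'PB [B'PB + R]^-1 B'PA
        P = [[AtPA[i][j] + Q[i][j] - corr[i][j] for j in range(n)]
             for i in range(n)]
    return gain(A, B, P, R), P
```

For a controllable pair (A, B) and a positive definite Q, this iteration is expected to converge to the stabilizing solution, so the closed-loop matrix A - BK has all eigenvalues inside the unit circle.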
2. Backpropagating Neural Network Design

The system with the nonlinear regulator is shown in
Figure 5. The modeling and control BNNs are labeled BNNM and
BNNC, respectively. The modeling BNN is connected in parallel
with the estimated plant parameters, labeled $A_e$ and $B_e$. The
control BNN is connected in parallel with the LQR, labeled K.
The block labeled D denotes a unit time delay.

The nonlinear plant will be modeled as a linear time-invariant
system with a nonlinearity added:

$x_{t+1} = A x_t + B u_t + v_t$ (2-13)

where $v_t$ is the n-dimensional nonlinear function. First, a
linear estimate of the state equation is made:

$\bar{x}_{t+1} = A_e x_t + B_e u_t$. (2-14)

Figure 5. Regulator Block Diagram

The estimates of the system parameters may be computed by any
number of methods. This linear estimate will deviate from the
actual system output due to the nonlinear system dynamics and
the inaccuracies in parameter estimation.

In order to compute a more accurate estimate of $x_{t+1}$,
the modeling BNN is used to produce an adjustment to $\bar{x}_{t+1}$. The
modeling BNN has the (m + n)-dimensional input vector (u, x)
and the n-dimensional output vector $\delta x_{t+1}$. Therefore, the new
estimate of $x_{t+1}$ is:

$\hat{x}_{t+1} = \bar{x}_{t+1} + \delta x_{t+1}$ (2-15)

where

$\delta x_{t+1} = g(u_t, x_t)$. (2-16)

The modeling BNN is trained so that the error between $\hat{x}_{t+1}$ and
$x_{t+1}$ is minimized.

The optimal feedback gain matrix K is computed from $A_e$
and $B_e$ found in equation (2-14) and user-defined weighting
matrices Q and R. The linear control input is computed:

$\bar{u}_t = -K x_t$. (2-17)

As discussed earlier, this controller is no longer optimal due
to the nonlinear system dynamics and the parameter
inaccuracies. The control BNN is used in order to optimize
the control input. The control BNN has the n-dimensional
input vector (x) and the m-dimensional output vector $\delta u_t$. The
control input $u_t$ is generated by

$u_t = \bar{u}_t + \delta u_t$ (2-18)

where

$\delta u_t = h(x_t)$. (2-19)

The control BNN is trained so that the control input $u_t$
minimizes the cost function of equation (2-9).
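
The way the two networks sit in parallel with the linear elements can be sketched as two small Python functions. The names `control_input` and `state_estimate` are hypothetical, and `g` and `h` are passed in as plain callables standing in for the trained modeling and control BNNs; the sketch only illustrates how equations (2-14) through (2-19) compose.

```python
def control_input(x, K, h):
    """Linear LQR control (2-17) plus the control BNN correction (2-18), (2-19)."""
    u_lin = [-sum(k * xi for k, xi in zip(row, x)) for row in K]
    du = h(x)
    return [ul + d for ul, d in zip(u_lin, du)]

def state_estimate(x, u, A_e, B_e, g):
    """Linear estimate (2-14) plus the modeling BNN correction (2-15), (2-16)."""
    x_lin = [sum(a * xi for a, xi in zip(row_a, x))
             + sum(b * ui for b, ui in zip(row_b, u))
             for row_a, row_b in zip(A_e, B_e)]
    dx = g(u, x)
    return [xl + d for xl, d in zip(x_lin, dx)]
```

If both callables return zero vectors, the functions reduce to the plain LQR control and the plain linear state estimate, which matches the role of the BNNs as corrections to the linear design.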


3. BNN Training Algorithms
The following notation is used for both the modeling and
the control BNN:

n refers to a particular layer of the network;
the ith node in the nth layer is designated node(n, i);
M is the total number of layers in the network, including
the input and output layers;
$N_n$ is the number of nodes in the nth layer;
$x^n_{i,t}$ is the output of the node(n, i) at time t;
$a^n_{i,j}$ is the linkweight from the node(n, j) to the node(n +
1, i).

The functional representation of the node(n, i) is found by:

$z^n_{i,t} = \sum_{j=1}^{N_{n-1}} a^{n-1}_{i,j} x^{n-1}_{j,t}$ (2-20)

$z^n_t = a^{n-1} x^{n-1}_t$ (2-21)

where $a^{n-1}$ is the ($N_n$ x $N_{n-1}$)-dimensional linkweight matrix and

$x^n_{i,t} = f(z^n_{i,t})$ (2-22)

where $x^n_{i,t}$ is the ith element of the vector input to layer n.
The function f(·) is chosen to be the sigmoid function

$f(x) = \frac{1 - e^{-x}}{1 + e^{-x}}$ (2-23)

for the hidden layers and

$f(x) = x$ (2-24)

for the output layer. In accord with this notation, the
dimensions $N_1 = m + n$ and $N_M = n$ in the modeling BNN, and $N_1 = n$
and $N_M = m$ in the control BNN, where n and m are the number
of columns in the system matrices A and B, respectively.
a. Modeling BNN 

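In deriving the modeling BNN, the first step is to
determine the error function which will be used for training.
Since the goal is to produce an accurate estimate of $x_{t+1}$, the
error function is chosen to minimize the difference between
$\hat{x}_{t+1}$ and $x_{t+1}$:

$E^M_{t+1} = \frac{1}{2} (\hat{x}_{t+1} - x_{t+1})^T (\hat{x}_{t+1} - x_{t+1})$. (2-25)

The linkweights are updated using a version of the gradient
descent algorithm:

$a^n_{i,j,t+1} = a^n_{i,j,t} - \mu_M \frac{\partial E^M_{t+1}}{\partial a^n_{i,j}}$ (2-26)

where $\mu_M$ is a step size parameter which controls the learning
rate of the linkweights. The partial derivative of the cost
function is computed as follows:

$\frac{\partial E^M_{t+1}}{\partial a^n_{i,j}} = \frac{\partial}{\partial a^n_{i,j}} \left[ \frac{1}{2} (\hat{x}_{t+1} - x_{t+1})^T (\hat{x}_{t+1} - x_{t+1}) \right]$
$= (\hat{x}_{t+1} - x_{t+1})^T \frac{\partial}{\partial a^n_{i,j}} (\hat{x}_{t+1} - x_{t+1})$
$= (\hat{x}_{t+1} - x_{t+1})^T \frac{\partial \hat{x}_{t+1}}{\partial a^n_{i,j}}$ (2-27)

where

$\frac{\partial \hat{x}_{t+1}}{\partial a^n_{i,j}} = \frac{\partial}{\partial a^n_{i,j}} (\bar{x}_{t+1} + \delta x_{t+1})$. (2-28)

Since $\bar{x}_{t+1}$ is independent of the linkweights:

$\frac{\partial \hat{x}_{t+1}}{\partial a^n_{i,j}} = \frac{\partial \delta x_{t+1}}{\partial a^n_{i,j}} = \frac{\partial}{\partial a^n_{i,j}} g(u_t, x_t)$. (2-29)

The function g is the mapping function for the layers of the
network described earlier.

Therefore,

$\frac{\partial E^M_{t+1}}{\partial a^n_{i,j}} = (\hat{x}_{t+1} - x_{t+1})^T \frac{\partial}{\partial a^n_{i,j}} g(u_t, x_t)$. (2-30)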

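The computation of ($\partial g / \partial a$) is found next [Ref. 2:
p.412]:

$\frac{\partial}{\partial a^n_{i,j}} g(u_t, x_t) = \Delta^n_i x^n_{j,t}$ (2-31)

where

$\Delta^n_i = f'(z^{n+1}_{i,t}) \sum_{k=1}^{N_{n+2}} a^{n+1}_{k,i} \Delta^{n+1}_k$. (2-32)

The function f'(x) denotes the first derivative of f(x) at x.
The initial condition on $\Delta$ is

$\Delta^{M-1}_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T$, with the 1 in the ith position. (2-33)

Therefore, the linkweight training is accomplished as:

$a^n_{i,j,t+1} = a^n_{i,j,t} - \mu_M (\hat{x}_{t+1} - x_{t+1})^T \Delta^n_i x^n_{j,t}$. (2-34)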
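
A sketch of the gradient in equations (2-30) through (2-34) for a three-layer modeling network follows. Several choices here are assumptions made for illustration: the hidden activation is `math.tanh` as a stand-in for the sigmoid, the function names are hypothetical, and each weight matrix is stored with one row per destination node. With a linear output layer, the product of the estimation error with the output-layer $\Delta$ reduces to the ith error component, which is what the code uses.

```python
import math

def forward(a1, a2, inp):
    """Three-layer network: tanh hidden layer (a stand-in), linear output layer."""
    z2 = [sum(w * v for w, v in zip(row, inp)) for row in a1]
    x2 = [math.tanh(z) for z in z2]
    out = [sum(w * v for w, v in zip(row, x2)) for row in a2]
    return z2, x2, out

def modeling_gradients(a1, a2, inp, x_lin, x_true):
    """Return the corrected estimate and dE/da for both weight matrices."""
    z2, x2, dx = forward(a1, a2, inp)
    x_hat = [xl + d for xl, d in zip(x_lin, dx)]          # equation (2-15)
    err = [xh - xt for xh, xt in zip(x_hat, x_true)]      # estimate minus true state
    # Output-layer weights: Delta is a unit vector, so err' * Delta_i = err[i].
    g2 = [[err[i] * x2[j] for j in range(len(x2))] for i in range(len(a2))]
    # Hidden-layer weights: Delta_i = f'(z_i) times the ith column of a2.
    g1 = []
    for i in range(len(a1)):
        fprime = 1.0 - x2[i] ** 2                         # derivative of tanh
        back = fprime * sum(err[k] * a2[k][i] for k in range(len(a2)))
        g1.append([back * v for v in inp])
    return x_hat, g1, g2

def gradient_step(weights, grads, mu):
    """Gradient descent update of one weight matrix, as in equation (2-34)."""
    return [[w - mu * g for w, g in zip(rw, rg)] for rw, rg in zip(weights, grads)]
```

A finite-difference comparison against the error function of equation (2-25) is a simple way to check that the recursion has been coded consistently before it is used in a control loop.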


b. Control BNN 
Finding the error function for the control BNN is
more involved than for the modeling BNN. If the plant were an
LTI system, the control input $u_t$ which minimizes

$J_t = \frac{1}{2} (x_{t+1}^T P x_{t+1} + u_t^T R u_t)$ (2-35)

is given by equations (2-10) through (2-12) [Ref. 2: p.413].
This implies that the LQR generates a control input which
minimizes equation (2-35) for each time t. This control input
is suboptimal even if the system is LTI. The nonlinearities
in the system make the LQR even less optimal. The control BNN
is connected in parallel with the LQR to adjust for control
errors caused by parameter inaccuracies and nonlinear system
dynamics. Therefore an error function must be chosen which
makes the output of the control BNN equal zero if the plant
parameters are exactly expressed by $A_e$ and $B_e$ and the
nonlinearities are set equal to zero. Equation (2-35) is a
good choice for the error function at time t, but $x_{t+1}$ is not
available at time t. It is replaced by $\hat{x}_{t+1}$, which is found
with the modeling BNN. The error function for the control BNN
is now

$E^C_t = \frac{1}{2} (\hat{x}_{t+1}^T P \hat{x}_{t+1} + u_t^T R u_t)$. (2-36)

The linkweight for the control BNN is defined as
$b^n_{i,j}$ in order to differentiate it from the modeling BNN
linkweight, $a^n_{i,j}$. The control linkweights are updated by the
gradient descent algorithm:

$b^n_{i,j,t+1} = b^n_{i,j,t} - \mu_C \frac{\partial E^C_t}{\partial b^n_{i,j}}$ (2-37)

where $\mu_C$ is the learning rate parameter.


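Equation (2-19) defines the nonlinear function for
the control BNN. This is used to calculate the training
algorithm as follows:

$\frac{\partial E^C_t}{\partial b^n_{i,j}} = \frac{\partial}{\partial b^n_{i,j}} \left[ \frac{1}{2} (\hat{x}_{t+1}^T P \hat{x}_{t+1} + u_t^T R u_t) \right] = \hat{x}_{t+1}^T P \frac{\partial \hat{x}_{t+1}}{\partial b^n_{i,j}} + u_t^T R \frac{\partial u_t}{\partial b^n_{i,j}}$. (2-38)

First, ($\partial \hat{x}_{t+1} / \partial b$) will be derived followed by the derivation of
($\partial u / \partial b$). Substituting equation (2-15) into ($\partial \hat{x}_{t+1} / \partial b$) gives:

$\frac{\partial \hat{x}_{t+1}}{\partial b^n_{i,j}} = \frac{\partial}{\partial b^n_{i,j}} (\bar{x}_{t+1} + \delta x_{t+1})$. (2-39)

Next, equations (2-14) and (2-18) are used:

$\frac{\partial \hat{x}_{t+1}}{\partial b^n_{i,j}} = \frac{\partial}{\partial b^n_{i,j}} (A_e x_t + B_e u_t + \delta x_{t+1}) = \frac{\partial}{\partial b^n_{i,j}} (A_e x_t + B_e (\bar{u}_t + \delta u_t) + \delta x_{t+1})$ (2-40)

but $x_t$ and $\bar{u}_t$ are independent of b, therefore

$\frac{\partial \hat{x}_{t+1}}{\partial b^n_{i,j}} = B_e \frac{\partial \delta u_t}{\partial b^n_{i,j}} + \frac{\partial \delta x_{t+1}}{\partial b^n_{i,j}}$. (2-41)

Substituting in equations (2-19) and (2-16) results in

$\frac{\partial \hat{x}_{t+1}}{\partial b^n_{i,j}} = B_e \frac{\partial h(x_t)}{\partial b^n_{i,j}} + \frac{\partial g(u_t, x_t)}{\partial b^n_{i,j}}$. (2-42)

The second partial derivative in equation (2-42)
is the partial derivative of a function of u which is itself
dependent upon b, therefore the derivative must be split as
follows:

$\frac{\partial \hat{x}_{t+1}}{\partial b^n_{i,j}} = B_e \frac{\partial h(x_t)}{\partial b^n_{i,j}} + \frac{\partial g(u_t, x_t)}{\partial u_t} \frac{\partial u_t}{\partial b^n_{i,j}}$. (2-43)

But the partial derivative of $u_t$ may be further simplified by:

$\frac{\partial u_t}{\partial b^n_{i,j}} = \frac{\partial (\bar{u}_t + \delta u_t)}{\partial b^n_{i,j}} = \frac{\partial \delta u_t}{\partial b^n_{i,j}} = \frac{\partial h(x_t)}{\partial b^n_{i,j}}$. (2-44)

Substituting equation (2-44) into equation (2-43):

$\frac{\partial \hat{x}_{t+1}}{\partial b^n_{i,j}} = B_e \frac{\partial h(x_t)}{\partial b^n_{i,j}} + \frac{\partial g(u_t, x_t)}{\partial u_t} \frac{\partial h(x_t)}{\partial b^n_{i,j}} = \left[ B_e + \frac{\partial g(u_t, x_t)}{\partial u_t} \right] \frac{\partial h(x_t)}{\partial b^n_{i,j}}$. (2-45)

Using equation (2-31), ($\partial h(x) / \partial b$) may be computed in the same
manner as ($\partial g(u,x) / \partial a$). The (n x m) matrix ($\partial g(u_t, x_t) / \partial u_t$) is
found by [Ref. 2: p.413]:

$\frac{\partial g(u_t, x_t)}{\partial u_{j,t}} = \sum_{i=1}^{N_2} \Delta^1_i a^1_{i,j}, \quad j = 1, \ldots, m$. (2-46)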


The control linkweight training algorithm may be
summarized as:

$b^n_{i,j,t+1} = b^n_{i,j,t} - \mu_C \left[ \hat{x}_{t+1}^T P \left( B_e + \frac{\partial g(u_t, x_t)}{\partial u_t} \right) + u_t^T R \right] \Delta^n_i x^n_{j,t}$. (2-47)

The training algorithm for both BNNs may be
summarized as follows:

• observe the state vector $x_t$ at time t,
• update the linkweight matrices $a^n$ using equation (2-34),
• compute the control input $u_t$ with equations (2-17), (2-18),
and (2-19),
• calculate the state vector estimate $\hat{x}_{t+1}$ using equations
(2-14), (2-15), and (2-16),
• update the linkweight matrices $b^n$ using equation (2-47).

This is repeated at each successive time t, as the new state
vector becomes available.
III. SIMULATION RESULTS AND DISCUSSION


A. NONLINEAR SYSTEM DESIGN 

In order to test the LQR and the nonlinear regulator, a
nonlinear system model was required. This system was
modeled with a linear part which was then acted upon by a
nonlinear function. The linear portion was artificially
estimated to provide a more realistic case study. The linear
system used as a baseline was first presented as a transfer
function in continuous time, transformed to a state space
representation, and then converted to discrete time.

The chosen transfer function was unstable with poles at
+2.00 and -1.00, a zero at +0.10, and a gain of 10:

$\frac{Y(s)}{U(s)} = \frac{10(s - 0.1)}{(s - 2)(s + 1)}$. (3-1)





The equivalent of this transfer function in state space format
is [Ref. 1: pp.675-677]:

$\dot{y} = \begin{bmatrix} 0 & 1 \\ 2 & 1 \end{bmatrix} y + \begin{bmatrix} 10 \\ 9 \end{bmatrix} u = A_c y + B_c u$. (3-2)


This equation is converted to discrete time using a truncated
infinite series to calculate the state transition matrix and
input matrix [Ref. 4: pp.123-127] with a sample time of 0.1
seconds. The discrete time equation is:

$y_{t+1} = \begin{bmatrix} 1.0104 & 0.1055 \\ 0.2110 & 1.1159 \end{bmatrix} y_t + \begin{bmatrix} 1.0500 \\ 1.0533 \end{bmatrix} u_t = A y_t + B u_t$. (3-3)
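
The truncated-series conversion can be sketched as follows. The continuous-time matrices used in the usage note (rows [0, 1] and [2, 1] for the state matrix, and [10, 9] for the input matrix) are an assumption: they are the phase-variable form implied by the transfer function of equation (3-1). The function names and the number of retained terms are also choices made for this example.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(X, c):
    return [[c * a for a in row] for row in X]

def discretize(A_c, B_c, T, terms=12):
    """Truncated power series for the state transition and input matrices.

    A_d = sum_k (A_c T)^k / k!
    B_d = (sum_k A_c^k T^(k+1) / (k+1)!) B_c
    """
    n = len(A_c)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    A_d = [[0.0] * n for _ in range(n)]
    S = [[0.0] * n for _ in range(n)]      # integral of exp(A_c * tau) over [0, T]
    for k in range(terms):
        A_d = add(A_d, term)               # term holds (A_c T)^k / k!
        S = add(S, scale(term, T / (k + 1)))
        term = scale(matmul(term, A_c), T / (k + 1))
    return A_d, matmul(S, B_c)
```

Calling `discretize([[0.0, 1.0], [2.0, 1.0]], [[10.0], [9.0]], 0.1)` should give matrices that agree with the values in equation (3-3) to four decimal places.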


Estimated values ($A_e$ and $B_e$) were found for the purpose of
illustrating the relative robustness of the LQR and the
nonlinear regulator. These values were found by taking the
members of A and B and calculating values which differed from
the actual numbers by ±10 to 20 percent.

The nonlinear part of the system equation was modeled as
follows:


$x_{1,t+1} = y_{1,t+1} + \epsilon (1 - \exp(y_{1,t+1}))$
$x_{2,t+1} = y_{2,t+1} + \epsilon (1 - \exp(y_{2,t+1}))$. (3-4)

The value $\epsilon$ is a control on the level of nonlinearity.


B. LQR SIMULATION RESULTS 
The LQR is calculated using $A_e$ and $B_e$. The weighting
matrices Q and R are set at:


0 0 
= = (3-5) 
QO ; ‘ and R a 
The solutions to equations (2-11) and (2-12) are:

$K = [0.1775 \;\; 0.5057]$ and $P = \begin{bmatrix} 0.0771 & 0.1066 \\ 0.1060 & 1.3859 \end{bmatrix}$. (3-6)


Figures 6 and 7 illustrate the response of the LQR to
a unit impulse. Figure 6 shows the system response for ε =
0.05. The state variables quickly converge to zero. Figure
7 shows the response to the same input for ε = 0.20. It can
be seen that the LQR is not sufficiently robust to cause the
state variables to converge to zero. The plant converges to
a nonzero equilibrium value.


C. NONLINEAR REGULATOR SIMULATION RESULTS 

Both BNN's used here are three-layered neural networks
with 30 nodes in the hidden layer. In the modeling BNN, $N_1$ =
m + n = 3, $N_2$ = 30, and $N_3$ = n = 2. In the control BNN, $N_1$ =
n = 2, $N_2$ = 30, and $N_3$ = m = 1. The learning rate for both
BNN's was set at 0.05. The initial condition on the
linkweight matrices was to fill them with small random numbers
(normal distribution, standard deviation = 0.1).

Figure 6. LQR Result for ε = 0.05

Figure 7. LQR Result for ε = 0.20

Figure 8 plots simulation results for the nonlinear plant
with ε = 0.05. The results are quite similar to those of
Figure 6. For this level of nonlinearity, the nonlinear
regulator causes the state variables to converge to zero
faster than the LQR, but the response is slightly more
oscillatory.

Figure 8. Nonlinear Regulator Result for ε = 0.05


Figure 9 shows simulation results for the nonlinear plant
with ε = 0.20. Comparing these results to those of Figure 7
shows that the nonlinear regulator reduces the state variables
to zero in a case where the LQR is unable to do so. While the
LQR stabilizes the system for relatively large factors of ε,
it is of limited utility once the state variables no longer
converge to zero. The state variables converge to zero using
the LQR up to ε = 0.18 while the nonlinear regulator continues
to drive the state variables to zero up to ε = 0.21. The
performance of the nonlinear regulator is better than the LQR
above ε = 0.06.

Figure 9. Nonlinear Regulator Result for ε = 0.20


1. BNN Parameter Variation 

The next area of investigation was to vary some of the
parameters of the neural network to examine the effects upon
the regulator. This was accomplished by holding all
parameters constant with the exception of the one in question.
The performance index W is found by summing the absolute
values of the state variables ($x_{1,t}$ and $x_{2,t}$) for the specified
time range. The performance index W was calculated for each
instance, again for a unit impulse over a range of 20 seconds.
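
The performance index W described above amounts to a sum of absolute state values over the simulated trajectory. A minimal sketch follows; the function name and the representation of the trajectory as a sequence of state tuples are assumptions made for this example.

```python
def performance_index(trajectory):
    """Sum of the absolute values of every state variable over all time steps."""
    return sum(abs(value) for state in trajectory for value in state)
```

With a sample time of 0.1 seconds, a 20 second run would contribute roughly 200 state samples to this sum; a smaller W indicates that the states were driven closer to zero over the run.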

One of the major variances in the performance of
neural networks is caused by the number of nodes in the hidden
layer of the network. This number ($N_2$) was varied from 10 to
40. The results are plotted in Figure 10. The straight line
(without point markers) shows the performance of the LQR with
no neural network. Figure 10 shows that adding more nodes to
the hidden layer improves the performance of the regulator.
However, when ε is small, the LQR yields better performance
than the nonlinear regulator with any number of nodes. As ε
increases above 0.15, the performance degrades and the number
of nodes has less of an effect. There are limits upon the
number of nodes used. When the regulator was tried with 30
nodes in the hidden layer, the system became unstable.


[Plot of W (performance index) versus ε (nonlinearity coefficient) for the LQR and for 10, 20, 30, and 40 hidden nodes]

Figure 10. Variation in W due to Number of Nodes


The learning rate was also varied to determine its
effect upon regulator performance. This step size parameter
had an effect on W similar to that of varying $N_2$. Both the
modeling and the control BNN's had their learning rate changed
to the same value. Figure 11 shows the effect of this upon W.
An interesting feature of Figure 11 is that while increasing
the learning rate caused better performance, this was only
true up to μ = 0.05. When μ was increased beyond this,
performance was degraded. Increasing μ to 0.10 caused the
system to become unstable.


[Plot of W (performance index) versus ε (nonlinearity coefficient) for several values of the learning parameter]

Figure 11. Variation in W due to Learning Rate


2. BNN Training 
One of the advantages of this regulator design is that
there is very little training required in order to achieve
convergence of the linkweight values. All simulation runs
were completed with a single unit impulse training run. One
of the major drawbacks to the majority of neural network
applications is the amount of training inputs required for
convergence [Ref. 3: p.88]. The apparent reason for this
rapid convergence is that the BNN's are not providing all of
the state vector estimate and control input: they are only
adding a correction to the linear state vector estimate and
the LQR control input.
IV. CONCLUSIONS


A. SUMMARY 

Starting with the rationale for developing an adaptive 
regulator for nonlinear systems, the development of a 
nonlinear regulator which used backpropagating neural networks 
in conjunction with a linear quadratic regulator in an 
adaptive control system was proposed. The regulator was 
derived and applied to control a representative nonlinear 


system. 


B. SIGNIFICANT RESULTS 
Simulations of the BNN-based nonlinear regulator were 
conducted on a representative nonlinear system. The main 


observations from these simulations are summarized below: 


• The results show that the nonlinear regulator works well
  in the control of a nonlinear system. Using an LQR alone
  on the system also works if the nonlinearity is small.

• Many variables in the regulator design must be optimized
  by trial and error. Definitive rules governing the
  selection of these variables would significantly decrease
  the time needed to arrive at an appropriate controller.

• The amount of training required for the regulator is
  minimal. One pulse is sufficient to train the network.

• This regulator has not been proven for nonlinear systems
  in general. Its utility may be limited to systems that can
  be modeled as a linear system with an added nonlinear
  component.

C. FURTHER RESEARCH

The emphasis in this thesis was on developing a BNN-based
nonlinear regulator that uses a priori knowledge of the
system parameters to find a controller that functions more
efficiently than one which does not use this knowledge. The
neural networks used in this research contained only one
hidden layer. While this was sufficient, the rules used in
deriving the BNN’s for the regulator could also be used to
design neural networks with several hidden layers.
Guidelines should be developed to govern the choice of the
number of layers in the neural networks, the number of
nodes in the hidden layers, the learning parameters, and
the weighting matrices for the error functions.

APPENDIX A: NONLINEAR REGULATOR SIMULATION


A. SYSTEM DERIVATION 
A nonlinear regulator using backpropagating neural
networks in conjunction with a linear quadratic regulator was
designed in Section B of Chapter II. This appendix details
the procedure used to arrive at the simulation results of
Chapter III and includes the software written for the
research. The representative nonlinear system chosen for
examination has the continuous domain transfer function:


[Equation (A-1): the transfer function H(s) = Y(s)/U(s) of the
representative second-order system; the coefficients are not
legible in the source.]





This equation must be transformed to state space format. This
is accomplished using the derivation shown below [Ref. 1:
pp. 675-677].

\[
\frac{Y(s)}{U(s)} =
\frac{b_0 s^n + b_1 s^{n-1} + \cdots + b_{n-1} s + b_n}
     {s^n + a_1 s^{n-1} + \cdots + a_{n-1} s + a_n}
\tag{A-2}
\]





This may be rearranged and transformed to an nth-order system
of linear differential equations:

\[
y^{(n)} + a_1 y^{(n-1)} + \cdots + a_{n-1}\dot{y} + a_n y
= b_0 u^{(n)} + b_1 u^{(n-1)} + \cdots + b_{n-1}\dot{u} + b_n u
\tag{A-3}
\]


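where $y^{(n)}$ and $u^{(n)}$ are the nth order derivatives of
y and u. The state variables are chosen:

\[
\begin{aligned}
x_1 &= y \\
\dot{x}_1 &= x_2 \\
\dot{x}_2 &= x_3 \\
&\;\;\vdots \\
\dot{x}_n &= -a_n x_1 - a_{n-1} x_2 - \cdots - a_1 x_n
            + b_0 u^{(n)} + b_1 u^{(n-1)} + \cdots + b_n u
\end{aligned}
\tag{A-4}
\]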


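However, $x_1 = y$ may not yield a unique solution due to the
derivatives of the forcing function. The state variables are
redefined as

\[
\begin{aligned}
x_1 &= y - \beta_0 u \\
x_2 &= \dot{y} - \beta_0 \dot{u} - \beta_1 u = \dot{x}_1 - \beta_1 u \\
x_3 &= \ddot{y} - \beta_0 \ddot{u} - \beta_1 \dot{u} - \beta_2 u
     = \dot{x}_2 - \beta_2 u \\
&\;\;\vdots \\
x_n &= y^{(n-1)} - \beta_0 u^{(n-1)} - \cdots - \beta_{n-1} u
     = \dot{x}_{n-1} - \beta_{n-1} u
\end{aligned}
\tag{A-5}
\]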


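where $\beta_0, \beta_1, \ldots, \beta_n$ are determined from

\[
\begin{aligned}
\beta_0 &= b_0 \\
\beta_1 &= b_1 - a_1 \beta_0 \\
\beta_2 &= b_2 - a_1 \beta_1 - a_2 \beta_0 \\
&\;\;\vdots \\
\beta_n &= b_n - a_1 \beta_{n-1} - \cdots - a_{n-1} \beta_1 - a_n \beta_0
\end{aligned}
\tag{A-6}
\]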


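With this choice of state variables, the following state and
output equations are found:

\[
\begin{aligned}
\dot{x} &= A_c x + B_c u \\
y &= C_c x + D_c u
\end{aligned}
\tag{A-7}
\]

where

\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix},
\quad
A_c = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_1
\end{bmatrix},
\quad
B_c = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{n-1} \\ \beta_n \end{bmatrix},
\quad
C_c = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix},
\quad
D_c = \beta_0 = b_0
\tag{A-8}
\]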
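
As a worked illustration of Equations (A-6) through (A-8), the
following MATLAB sketch builds the state space matrices for a
second-order transfer function and checks them against the
transfer function at one test point. The coefficients in num and
den are hypothetical values chosen for the illustration; they are
not those of the representative system, and the test point s0 is
likewise an assumption.

```matlab
% Illustrative second-order transfer function (hypothetical values,
% not the thesis system):  H(s) = (s^2 + 4s + 5)/(s^2 + 3s + 2)
num = [1 4 5];      % b0 b1 b2
den = [1 3 2];      % 1  a1 a2

a1 = den(2);  a2 = den(3);
b0 = num(1);  b1 = num(2);  b2 = num(3);

% Equation (A-6)
beta0 = b0;
beta1 = b1 - a1*beta0;
beta2 = b2 - a1*beta1 - a2*beta0;

% Equation (A-8) for n = 2
Ac = [0 1; -a2 -a1];
Bc = [beta1; beta2];
Cc = [1 0];
Dc = beta0;

% Consistency check: both forms should give the same value of H(s0)
s0   = 1;                                   % assumed test point
H_tf = polyval(num, s0) / polyval(den, s0);
H_ss = Cc * ((s0*eye(2) - Ac) \ Bc) + Dc;
fprintf('H_tf = %.6f, H_ss = %.6f\n', H_tf, H_ss);
```

Evaluating the two forms at a single point does not prove that
they are equal, but it catches sign and indexing slips in the
beta recursion at little cost.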

B. SIMULATION SOFTWARE


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%  LQR.M
%
%  This program computes a linear quadratic regulator for the
%  stated system and simulates the response to a unit
%  impulse. No neural network adjustment is provided.
%  This is a stand-alone program.
%
%  LT Kurt Menke, 10 June 1992
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
clear; clg
%
% The first part of this program converts the system
% from continuous time transfer function to discrete
% domain state space equations.
%
% The numbers for num and den are for the
% representative system of Chapter III. This is set up
% for a second order system with a single input.
% It may be adapted for any order system using the equations
% in Appendix A.
%
% num and den hold the numerator and denominator coefficients
% of Equation (A-1); the values are not legible in the source.
% num = [ ... ];
% den = [ ... ];
%
% This transformation to state space follows the
% format of Appendix A.
%
a1 = den(1,2);
a2 = den(1,3);
b0 = num(1,1);
b1 = num(1,2);
b2 = num(1,3);
beta0 = b0;
beta1 = b1 - a1*beta0;
beta2 = b2 - a1*beta1 - a2*beta0;
%
% Continuous state space equations
Ac = [  0   1
      -a2 -a1];
Bc = [beta1; beta2];
Cc = [1 0];
Dc = beta0;
%
disp('Sample time for continuous')
dt = input('to discrete conversion? ');
%
% Discrete state space equations
[A, B] = c2d(Ac,Bc,dt);
%
[na,ma] = size(A);
[nb,mb] = size(B);
rand('uniform')
rand('seed',0)
%
% Ar, Br, As, and Bs are used to compute the "estimate" of
% A and B used in the computation of the LQR.
% This ensures that Ae and Be are within +/- 10-20%
% of A and B
Ar = 0.1 .* rand(na,ma) + 0.1;
Br = 0.1 .* rand(nb,mb) + 0.1;
%
% As and Bs are arbitrary signs which make the particular
% value of Ar/Br either +10-20% or -10-20%
As = [-1 1; 1 -1];
Bs = [-1; 1];
Ar = As .* Ar;
Br = Bs .* Br;
%
% Ae and Be are the estimates of A and B
Ae = (Ar .* A) + A;
Be = (Br .* B) + B;
%
% Weighting matrices
Q = [0 0; 0 1];
R = 1;
%
% Computes the LQR gain matrix (K) and the solution to the
% algebraic Riccati equation (P).
[K,P] = dlqr(Ae,Be,Q,R);
%
% e is the coefficient which controls the amount of
% nonlinearity seen by the system.
e = input('Nonlinearity coefficient? ');
%
% Recommended run time: 10 or 20 seconds.
time = input('Amount of system run time? ');
%
% Initial values
Nt = time/dt + 1;
xhat = zeros(na,Nt);
y = zeros(na,Nt);
x = zeros(na,Nt);
u = [1 zeros(1,Nt-1)];   % Impulse input at time 0.
%
for t = 2:Nt
  y(:,t) = A*x(:,t-1) + B*u(:,t-1);
  % State vector
  x(1,t) = y(1,t) + e*(1-exp(y(2,t)));
  x(2,t) = y(2,t) + e*(1-exp(y(1,t)));
  % Control input
  u(:,t) = -K*x(:,t);
  % State vector estimate
  xhat(:,t+1) = Ae*x(:,t) + Be*u(:,t);
end;
%
% Performance calculation
W1 = 0;
W2 = 0;
for t = 1:Nt
  W1 = W1 + abs(x(1,t));
  W2 = W2 + abs(x(2,t));
end;
W = W1 + W2;
%
% Plot of state vector response to unit impulse
t=0:dt:time;
plot(t,x(1,:),t,x(2,:)),grid;
text(0.8,0.8,num2str(W),'sc');
title('Linear Quadratic Regulator');
xlabel('Time (sec)');
ylabel('System Output');
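
LQR.M obtains K and P from the dlqr routine of the Control
System Toolbox. Where that toolbox is not available, the same
quantities may be approximated by iterating the Riccati
difference equation to a fixed point. The following is a minimal
sketch of that iteration, not the algorithm used inside dlqr; the
plant matrices, weights, tolerance, and iteration limit are
assumed values for illustration and are not taken from the
representative system.

```matlab
% Hypothetical discrete-time plant (double integrator, dt = 0.1 s)
A = [1 0.1; 0 1];
B = [0.005; 0.1];
Q = eye(2);          % state weighting (assumed)
R = 1;               % control weighting (assumed)

P = Q;               % initial guess for the Riccati solution
for k = 1:5000
    K  = (R + B'*P*B) \ (B'*P*A);        % gain for the current P
    Pn = Q + A'*P*A - A'*P*B*K;          % Riccati difference equation
    converged = norm(Pn - P, 'fro') < 1e-10 * norm(Pn, 'fro');
    P = Pn;
    if converged, break; end
end
K = (R + B'*P*B) \ (B'*P*A);             % gain for the final P

% How well P satisfies the algebraic Riccati equation, and whether
% the closed loop A - B*K is stable (spectral radius below one)
residual = norm(Q + A'*P*A - A'*P*B*K - P, 'fro');
rho      = max(abs(eig(A - B*K)));
fprintf('iterations = %d, residual = %.2e, rho = %.4f\n', k, residual, rho);
```

The iteration can be expected to converge only when the pair
(A,B) is stabilizable and the weighted states are detectable,
which is why the assumed Q here weights both states.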

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%  NLREG.M
%
%  This program computes a nonlinear regulator for the stated
%  system and simulates the response to a unit impulse. The
%  regulator uses two neural networks: BNNC and BNNM. This
%  is a stand-alone program.
%
%  LT Kurt Menke, 10 June 1992
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
clear; clg;
%
% The first part of this program converts the system
% from continuous time transfer function to discrete
% domain state space equations.
%
% The numbers for num and den are for the
% representative system of Chapter III. This is set up
% for a second order system with a single input.
% It may be adapted for any order system using the equations
% in Appendix A.
%
% num and den hold the numerator and denominator coefficients
% of Equation (A-1); the values are not legible in the source.
% num = [ ... ];
% den = conv([ ... ],[ ... ]);
%
% This transformation to state space follows the
% format of Appendix A.
%
a1 = den(1,2);
a2 = den(1,3);
b0 = num(1,1);
b1 = num(1,2);
b2 = num(1,3);
beta0 = b0;
beta1 = b1 - a1*beta0;
beta2 = b2 - a1*beta1 - a2*beta0;
%
% Continuous state space equations
Ac = [  0   1
      -a2 -a1];
Bc = [beta1; beta2];
Cc = [1 0];
Dc = beta0;
%
disp('Sample time for continuous')
dt = input('to discrete conversion? ');
%
% Discrete state space equations
[A, B] = c2d(Ac,Bc,dt);
%
[na,ma] = size(A);
[nb,mb] = size(B);
rand('uniform')
rand('seed',0)
%
% Ar, Br, As, and Bs are used to compute the "estimate" of
% A and B used in the computation of the LQR.
% This ensures that Ae and Be are within +/- 10-20%
% of A and B
Ar = 0.1 .* rand(na,ma) + 0.1;
Br = 0.1 .* rand(nb,mb) + 0.1;
%
% As and Bs are arbitrary signs which make the particular
% value of Ar/Br either +10-20% or -10-20%
As = [-1 1; 1 -1];
Bs = [-1; 1];
Ar = As .* Ar;
Br = Bs .* Br;
%
% Ae and Be are the estimates of A and B
Ae = (Ar .* A) + A;
Be = (Br .* B) + B;
%
% Weighting matrices
Q = [0 0; 0 1];
R = 1;
%
% Computes the LQR gain matrix (K) and the solution to the
% algebraic Riccati equation (P).
[K,P] = dlqr(Ae,Be,Q,R);
%
% Learning rate parameters. LearnM is for BNNM and LearnC
% is for BNNC.
LearnM = 0.05;
LearnC = 0.05;
%
% e is the coefficient which controls the amount of
% nonlinearity seen by the system.
e = input('Nonlinearity coefficient? ');
%
% Recommended run time: 10 or 20 seconds.
time = input('Amount of system run time? ');
%
disp('Number of nodes in hidden layer')
Nm = input('of modeling net (BNNM)? ');
%
disp('Number of nodes in hidden layer')
Nc = input('of control net (BNNC)? ');
%
% Initialization of the linkweight matrices
[a1,a2] = netinitm(na,Nm,na+2);
[b1,b2] = netinitm(mb,Nc,nb+1);
%
% Bias value for neural networks
Bias = 1;
%
% Initial values
Nt = time/dt + 1;
x = zeros(na,Nt);
y = zeros(na,Nt);
xbar = zeros(na,Nt+1);
xhat = zeros(na,Nt+1);
xdel = zeros(na,Nt+1);
ex = zeros(na,Nt);
u = [1 zeros(mb,Nt-1)];   % Impulse input at time 0
%
% Training run
for t = 2:Nt
  y(:,t) = A*x(:,t-1) + B*u(:,t-1);
  % State vector
  x(1,t) = y(1,t) + e*(1-exp(y(2,t)));
  x(2,t) = y(2,t) + e*(1-exp(y(1,t)));
  % Error vector
  ex(:,t) = x(:,t)-xhat(:,t);
  % Training of linkweights for BNNM
  [a1,a2] = bpm(a1,a2,[x(:,t);u(:,t);Bias],ex(:,t),LearnM);
  % Linear control input
  ubar(:,t) = -K*x(:,t);
  % BNNC output
  udel(:,t) = netm([x(:,t); Bias], b1, b2);
  % Control input
  u(:,t) = ubar(:,t) + udel(:,t);
  % Linear state vector estimate
  xbar(:,t+1) = Ae*x(:,t) + Be*u(:,t);
  % BNNM output
  xdel(:,t+1) = netm([x(:,t); u(:,t); Bias], a1, a2);
  % State vector estimate
  xhat(:,t+1) = xbar(:,t+1) + xdel(:,t+1);
  % Training of linkweights for BNNC
  [b1,b2] = bpc(a1,a2,b1,b2,x(:,t),u(:,t),xhat(:,t+1),...
                Bias,P,Be,R,LearnC);
end;

%
% Resetting initial values
x = zeros(na,Nt);
y = zeros(na,Nt);
xbar = zeros(na,Nt+1);
xhat = zeros(na,Nt+1);
xdel = zeros(na,Nt+1);
ex = zeros(na,Nt);
ubar = zeros(mb,Nt);
udel = zeros(mb,Nt);
u = [1 zeros(mb,Nt-1)];   % Impulse input at time 0
%
% Simulation run
for t = 2:Nt
  y(:,t) = A*x(:,t-1) + B*u(:,t-1);
  % State vector
  x(1,t) = y(1,t) + e*(1-exp(y(2,t)));
  x(2,t) = y(2,t) + e*(1-exp(y(1,t)));
  % Error vector
  ex(:,t) = x(:,t)-xhat(:,t);
  % Training of linkweights for BNNM
  [a1,a2] = bpm(a1,a2,[x(:,t);u(:,t);Bias],ex(:,t),LearnM);
  % Linear control input
  ubar(:,t) = -K*x(:,t);
  % BNNC output
  udel(:,t) = netm([x(:,t); Bias], b1, b2);
  % Control input
  u(:,t) = ubar(:,t) + udel(:,t);
  % Linear state vector estimate
  xbar(:,t+1) = Ae*x(:,t) + Be*u(:,t);
  % BNNM output
  xdel(:,t+1) = netm([x(:,t); u(:,t); Bias], a1, a2);
  % State vector estimate
  xhat(:,t+1) = xbar(:,t+1) + xdel(:,t+1);
  % Training of linkweights for BNNC
  [b1,b2] = bpc(a1,a2,b1,b2,x(:,t),u(:,t),xhat(:,t+1),...
                Bias,P,Be,R,LearnC);
end;
%
% Performance calculation
W1 = 0;
W2 = 0;
for t = 1:Nt
  W1 = W1 + abs(x(1,t));
  W2 = W2 + abs(x(2,t));
end;
W = W1 + W2;
%
% Plot of state vector response to unit impulse
t=0:dt:time;
plot(t,x(1,:),t,x(2,:)),grid;
text(0.8,0.8,num2str(W),'sc');
title('Linear Quadratic Regulator');
xlabel('Time (sec)');
ylabel('System Output');

The following programs are the subroutines required to run
LQR.M and NLREG.M.


function [a1,a2] = netinitm(ny,M,nu)
%
% Routine for initializing the neural net with
% small random numbers.
%
% Function call: [a1,a2] = netinitm(ny,M,nu)
%
% where ny = number of outputs
%       M  = number of nodes in the hidden layer
%       nu = number of inputs
%
% LT Kurt Menke, 10 June 1992
%
rand('normal')
rand('seed',0)
a1 = 0.1*rand(M,nu);
a2 = 0.1*rand(ny,M);
return


function x3 = netm(x1,a1,a2)
%
% Routine for calculating the output of a neural net.
%
% Function call: x3 = netm(x1,a1,a2)
%
% where x1 = neural net input vector
%       a1 = linkweight matrix from input to hidden layer
%       a2 = linkweight matrix from hidden layer to output
%
% LT Kurt Menke, 10 June 1992
%
x2 = sigmoid(a1*x1);
x3 = a2*x2;
return


function y = sigmoid(x)
%
% Routine for calculating the sigmoid of a vector input
%
% Function call: y = sigmoid(x)
%
% where x = vector input
%       y = vector output
%
% LT Kurt Menke, 10 June 1992
%
y = 1 ./ (1+exp(-x));
return


function y = dsig(x)
%
% Routine for calculating the first derivative of the
% sigmoid of a vector input
%
% Function call: y = dsig(x)
%
% where x = vector input
%       y = vector output
%
% LT Kurt Menke, 10 June 1992
%
temp = exp(-x);
y = temp ./ (1 + 2*temp + temp.*temp);
return;
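
The denominator used in dsig, 1 + 2*temp + temp.*temp, is the
expansion of (1 + temp)^2, so the routine returns
sigmoid(x).*(1 - sigmoid(x)). The following sketch checks that
identity numerically and compares it with a central difference of
the sigmoid. The anonymous functions stand in for the sigmoid and
dsig routines listed above, and the test points and step size h
are assumed values.

```matlab
% Stand-ins for the sigmoid and dsig routines
sigmoid = @(x) 1 ./ (1 + exp(-x));
dsig    = @(x) exp(-x) ./ (1 + 2*exp(-x) + exp(-x).*exp(-x));

x  = linspace(-4, 4, 9)';                 % assumed test points
d1 = dsig(x);                             % form used in the listing
d2 = sigmoid(x) .* (1 - sigmoid(x));      % closed-form identity
h  = 1e-6;                                % assumed difference step
d3 = (sigmoid(x + h) - sigmoid(x - h)) / (2*h);

err_identity = max(abs(d1 - d2));
err_fd       = max(abs(d1 - d3));
fprintf('identity error = %.2e, finite-difference error = %.2e\n', ...
        err_identity, err_fd);
```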


function [a1,a2] = bpm(a1,a2,x1,ex,mu)
%
% Routine for updating the linkweight matrices for BNNM.
%
% Function call: [a1,a2] = bpm(a1,a2,x1,ex,mu)
%
% where a1 = linkweight matrix from input to hidden layer
%       a2 = linkweight matrix from hidden layer to output
%       x1 = neural net input vector
%       ex = error vector for linkweight adjustment
%       mu = learning rate for BNNM
%
% LT Kurt Menke, 10 June 1992
%
z2 = a1*x1;
x2 = sigmoid(z2);
z3 = a2*x2;
%
parE_a1 = -diag(dsig(z2))*a2'*ex*x1';
parE_a2 = -ex*x2';
%
a1 = a1 - mu.*parE_a1;
a2 = a2 - mu.*parE_a2;
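
The two partial derivatives in bpm have the form of the gradient
of a squared-error cost E = 0.5*ex'*ex, where ex is the desired
network output minus the actual output. The sketch below checks
that reading against central differences and applies one update
step. It assumes that interpretation of ex; the network sizes,
linkweights, input vector, desired output d, and step size h are
all hypothetical values chosen for illustration.

```matlab
sigmoid = @(x) 1 ./ (1 + exp(-x));
dsig    = @(x) exp(-x) ./ (1 + exp(-x)).^2;

% Hypothetical network: 4 inputs, 3 hidden nodes, 2 outputs
a1 = 0.1 * [1 -2 3 0.5; 2 1 -1 1.5; -0.5 0.5 1 -1];
a2 = 0.1 * [1 -1 2; 0.5 1 -2];
x1 = [0.3; -0.2; 1; 1];          % plays the role of [x; u; Bias]
d  = [0.05; -0.02];              % assumed desired network output

% Squared-error cost as a function of the two linkweight matrices
E = @(w1, w2) 0.5 * sum((d - w2*sigmoid(w1*x1)).^2);

% Analytic gradients in the form used by bpm
z2 = a1*x1;  x2 = sigmoid(z2);  ex = d - a2*x2;
parE_a1 = -diag(dsig(z2)) * a2' * ex * x1';
parE_a2 = -ex * x2';

% Central-difference gradients, one linkweight at a time
h = 1e-6;
fd_a1 = zeros(size(a1));
for i = 1:numel(a1)
    dw = zeros(size(a1));  dw(i) = h;
    fd_a1(i) = (E(a1 + dw, a2) - E(a1 - dw, a2)) / (2*h);
end
fd_a2 = zeros(size(a2));
for i = 1:numel(a2)
    dw = zeros(size(a2));  dw(i) = h;
    fd_a2(i) = (E(a1, a2 + dw) - E(a1, a2 - dw)) / (2*h);
end
grad_err = max(abs([parE_a1(:) - fd_a1(:); parE_a2(:) - fd_a2(:)]));

% One gradient step with the learning rate used in NLREG.M
mu = 0.05;
E_before = E(a1, a2);
E_after  = E(a1 - mu*parE_a1, a2 - mu*parE_a2);
fprintf('grad_err = %.2e, E: %.6e -> %.6e\n', grad_err, E_before, E_after);
```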


function [b1,b2] = bpc(a1,a2,b1,b2,x,u,xhat,Bias,P,B,R,mu)
%
% Routine for updating the linkweight matrices for BNNC.
%
% Function call: [b1,b2] = bpc(a1,a2,b1,b2,x,u,xhat,...
%                              Bias,P,B,R,mu)
%
% where a1   = BNNM linkweights from input to hidden layer
%       a2   = BNNM linkweights from hidden layer to output
%       b1   = BNNC linkweights from input to hidden layer
%       b2   = BNNC linkweights from hidden layer to output
%       x    = neural net input vector
%       u    = control input
%       xhat = state vector estimate
%       Bias = bias value for neural net
%       P    = matrix solution to algebraic Riccati equation
%       B    = system input matrix
%       R    = weighting factor
%       mu   = learning rate for BNNC
%
% LT Kurt Menke, 10 June 1992
%
x1a = [x; u; Bias];
z2a = a1*x1a;
x2a = sigmoid(z2a);
z3a = a2*x2a;
%
x1b = [x; Bias];
z2b = b1*x1b;
x2b = sigmoid(z2b);
z3b = b2*x2b;
%
parh_b2 = x2b';
parh_b1 = diag(dsig(z2b))*b2'*x1b';
%
del1a = a2*diag(dsig(z2a));
parg_h = del1a*(sum(a1'))';
parE_c = xhat'*P*(B+parg_h) + u'*R;
%
deb2 = parE_c.*parh_b2;
deb1 = parE_c.*parh_b1;
%
b1 = b1 - mu.*deb1;
b2 = b2 - mu.*deb2;